Ref #


Hits 


Search Query 


DBs 


Default Operator


Plurals 


Time Stamp 


L1


36 


(reexecut$9 re-execut$9) near9 
((out adj order$5) hazard$5) 


US-PGPUB; 

USPAT; 

USOCR; 

EPO; JPO; 

DERWENT; 

IBM_TDB


OR 


OFF 


2004/11/17 15:11 


L2


739 


(712/23).CCLS. 


US-PGPUB; 

USPAT; 

USOCR; 

EPO; JPO; 

DERWENT; 

IBM_TDB


OR 


OFF 


2004/11/17 14:24 


L3 


277 


(712/233).CCLS. 


US-PGPUB; 

USPAT; 

USOCR; 

EPO; JPO; 

DERWENT; 

IBM_TDB


OR 


OFF 


2004/11/17 14:24 


L4 


49326 


out adj order$3 


US-PGPUB; 

USPAT; 

USOCR; 

EPO; JPO; 

DERWENT; 

IBM_TDB


OR 


OFF 


2004/11/17 14:27 


L5 


239 


2 and 4 


US-PGPUB; 

USPAT; 

USOCR; 

EPO; JPO; 

DERWENT; 

IBM_TDB


OR 


OFF 


2004/11/17 14:25 


L6 


44 


3 and 4 


US-PGPUB; 

USPAT; 

USOCR; 

EPO; JPO; 

DERWENT; 

IBM_TDB


OR 


OFF 


2004/11/17 14:26 


L7 


154


(load adj instruction$3) near3
branch$5 


US-PGPUB; 

USPAT; 

USOCR; 

EPO; JPO; 

DERWENT; 

IBM_TDB


OR 


OFF 


2004/11/17 14:27 


L8 


50 


((out adj order$3) out-of-order$5) 
near3 load adj instruction$3 


US-PGPUB; 

USPAT; 

USOCR; 

EPO; JPO; 

DERWENT; 

IBM_TDB


OR 


OFF 


2004/11/17 14:28


L9


468 


(712/225).CCLS. 


US-PGPUB; 

USPAT; 

USOCR; 

EPO; JPO; 

DERWENT; 

IBM_TDB 


OR 


OFF 


2004/11/17 14:28 


L10 


11 


9 and 8 


US-PGPUB; 

USPAT; 

USOCR; 

EPO; JPO;

IBM_TDB


OR 


OFF 


2004/11/17 14:40 


L11


425 


(712/216).CCLS. 


US-PGPUB; 

USPAT; 

USOCR; 

EPO; JPO; 

DERWENT; 

IBM_TDB 


OR 


OFF 


2004/11/17 14:40 


L12 


12 


8 and 11


US-PGPUB; 

USPAT; 

USOCR; 

EPO; JPO; 

DERWENT; 

IBM_TDB 


OR 


OFF


2004/11/17 14:40 


L13 


24 


(performance$3 near5 penalty 
near9 improv$5) and "712"/$.ccls. 


US-PGPUB; 

USPAT; 

USOCR; 

EPO; JPO; 

DERWENT; 

IBM_TDB


OR 


OFF 


2004/11/17 15:16 


L14 


28 


(performance$3 near5 improv$5 
near3 significant$6 near8 
processor$5) and "712"/$.ccls. 


US-PGPUB; 

USPAT; 

USOCR; 

EPO; JPO; 

DERWENT; 

IBM_TDB


OR 


OFF 


2004/11/17 15:17 


L15 


143 


(performance$3 near5 improv$5 
near3 significant$6 near8 
processor$5) 


US-PGPUB; 

USPAT; 

USOCR; 

EPO; JPO; 

DERWENT; 

IBM_TDB


OR 


OFF 


2004/11/17 15:17 


L16 




("6725358").PN. 


US-PGPUB; 
USPAT; 
EPO; JPO; 
DERWENT; 
IBM_TDB


OR 


OFF 


2004/11/17 15:19 


L20 


206 


(load adj instruction$5) near 
((subsequen$6 follow following) adj 
load adj instruction$3) 


US-PGPUB; 

USPAT; 

USOCR; 

EPO; JPO; 

DERWENT; 

IBM_TDB


OR 


OFF 


2004/11/17 15:25


S1


4 


(("5913048") or ("6360314")).PN. 


US-PGPUB; 
USPAT; 
EPO; JPO; 
DERWENT; 
IBM_TDB


OR 


OFF 


2004/11/17 14:18 


S2 


302 


glew.in. 


US-PGPUB; 

USPAT; 

USOCR; 

EPO; JPO; 

DERWENT; 

IBM_TDB


OR 


OFF 


2004/11/16 17:12 


S3 


65 


glew.in. and "712"/$.ccls.


US-PGPUB; 

USPAT; 

USOCR; 

EPO; JPO; 

DERWENT; 

IBM_TDB


OR 


OFF 


2004/11/16 15:12 


S4 


2 


("5951670").PN. 


US-PGPUB; 
USPAT; 
EPO; JPO; 
DERWENT; 
IBM_TDB


OR 


OFF 


2004/11/16 15:32 


S5 


2 


("6725358").PN. 


US-PGPUB; 
USPAT; 
EPO; JPO; 
DERWENT; 
IBM_TDB


OR 


OFF 


2004/11/17 15:19 


S6 


111 


(re-execut$6 reexecut$6) near9 
(load adj instruction$5) 


US-PGPUB; 

USPAT;

USOCR; 

EPO; JPO; 

DERWENT; 

IBM_TDB


OR 


OFF 


2004/11/16 17:28 


S7 


2 


(re-execut$6 reexecut$6) near9 
(first adj load adj instruction$5) 


US-PGPUB; 

USPAT; 

USOCR; 

EPO; JPO; 

DERWENT; 

IBM_TDB


OR 


OFF 


2004/11/16 17:29 


S8 


1 


(re-execut$6 reexecut$6) near9 
(one adj load adj instruction$5)


US-PGPUB; 

USPAT; 

USOCR; 

EPO; JPO; 

DERWENT; 

IBM_TDB


OR 


OFF 


2004/11/16 17:29 


S9 


1 


(re-execut$6 reexecut$6) near9 
(preced$5 adj load adj 
instruction$5) 


US-PGPUB; 

USPAT; 

USOCR; 

EPO; JPO; 

DERWENT; 

IBM_TDB


OR 


OFF 


2004/11/16 17:36


US-PGPUB;

USPAT; 

USOCR; 

EPO; JPO; 

DERWENT; 

IBM_TDB 


OR 


OFF 


2004/11/16 17:36 


S11


0 


(first adj load adj instruction$3) 
near9 add near9 store$5 near9 
(second adj load adj instruction$5) 


US-PGPUB; 

USPAT; 

USOCR; 

EPO; JPO; 

DERWENT; 

IBM_TDB


OR 


OFF 


2004/11/16 17:43 


S12


(first adj load adj instruction$3)
with add with store$5 with (second 
adj load adj instruction$5) 


US-PGPUB; 

USPAT; 

USOCR; 

EPO; JPO; 

DERWENT; 

IBM_TDB 


OR 


OFF 


2004/11/16 17:47 


S13


0 


(first adj load adj instruction$3)
with (second adj load adj
instruction$5) with (same near3
target$5)


US-PGPUB; 

USPAT; 

USOCR; 

EPO; JPO; 

DERWENT; 

IBM_TDB 


OR 


OFF


2004/11/16 17:48


(first adj load adj instruction$3)
with (second adj load adj
target$5 near3 address$6)


US-PGPUB; 

USPAT; 

USOCR; 

EPO; JPO; 

DERWENT; 

IBM_TDB 


OR 


OFF 


2004/11/16 17:48

1 Predictive techniques for aggressive load speculation
Glenn Reinman, Brad Calder 

November 1998 Proceedings of the 31st annual ACM/IEEE international symposium on 
Microarchitecture

Full text available: pdf(1.94 MB) Additional Information: full citation, references, citings, index terms



2 On pipelining dynamic instruction scheduling logic
Jared Stark, Mary D. Brown, Yale N. Patt 

December 2000 Proceedings of the 33rd annual ACM/IEEE international symposium on 
Microarchitecture

Full text available: pdf(128.82 KB) ps(543.84 KB) Additional Information: full citation, references, citings, index terms

Publisher Site



3 Architecture: Scalable selective re-execution for EDGE architectures
Rajagopalan Desikan, Simha Sethumadhavan, Doug Burger, Stephen W. Keckler 
October 2004 Proceedings of the 11th international conference on Architectural 
support for programming languages and operating systems 

Full text available: pdf(214.38 KB) Additional Information: full citation, abstract, references, index terms

Pipeline flushes are becoming increasingly expensive in modern microprocessors with large 
instruction windows and deep pipelines. Selective re-execution is a technique that can 
reduce the penalty of mis-speculations by re-executing only instructions affected by the mis- 
speculation, instead of all instructions. In this paper we introduce a new selective re- 
execution mechanism that exploits the properties of a dataflow-like Explicit Data Graph 
Execution (EDGE) architecture to support efficient mis ... 

Keywords: EDGE architectures, load-store dependence prediction, mis-speculation 
recovery, selective re-execution, selective replay, speculative dataflow machines 



4 Unconstrained speculative execution with predicated state buffering
Hideki Ando, Chikako Nakanishi, Tetsuya Hara, Masao Nakaya 

May 1995 ACM SIGARCH Computer Architecture News, Proceedings of the 22nd
annual international symposium on Computer architecture, Volume 23 issue 2

Full text available: pdf(1.50 MB) Additional Information: full citation, abstract, references, citings, index terms

Speculative execution is execution of instructions before it is known whether these 
instructions should be executed. Compiler-based speculative execution has the potential to 
achieve both a high instruction per cycle rate and high clock rate. Pure compiler-based 
approaches, however, have greatly limited instruction scheduling due to a limited ability to 
handle side effects of speculative execution. Significant performance improvement is, thus, 
difficult in non-numerical applications. This paper ... 

5 Sentinel scheduling for VLIW and superscalar processors
Scott A. Mahlke, William Y. Chen, Wen-mei W. Hwu, B. Ramakrishna Rau, Michael S. 
Schlansker 

September 1992 ACM SIGPLAN Notices, Proceedings of the fifth international
conference on Architectural support for programming languages and
operating systems, Volume 27 issue 9

Full text available: pdf(1.22 MB) Additional Information: full citation, abstract, references, citings, index terms

Speculative execution is an important source of parallelism for VLIW and superscalar 
processors. A serious challenge with compiler-controlled speculative execution is to 
accurately detect and report all program execution errors at the time of occurrence. In this 
paper, a set of architectural features and compile-time scheduling support referred to as 
sentinel scheduling is introduced. Sentinel scheduling provides an effective framework for 
compiler-controlled speculative ex ... 

6 Classifying load and store instructions for memory renaming
Glenn Reinman, Brad Calder, Dean Tullsen, Gary Tyson, Todd Austin 

May 1999 Proceedings of the 13th international conference on Supercomputing 

Full text available: pdf(1.37 MB) Additional Information: full citation, references, citings, index terms



7 Speculative execution: Enhancing memory level parallelism via recovery-free value 
prediction 

Huiyang Zhou, Thomas M. Conte 

June 2003 Proceedings of the 17th annual international conference on 
Supercomputing 

Full text available: pdf(302.33 KB) Additional Information: full citation, abstract, references, index terms

The ever-increasing computational power of contemporary microprocessors reduces the 
execution time spent on arithmetic computations (i.e., the computations not involving slow 
memory operations such as cache misses) significantly. Therefore, for memory intensive 
workloads, it becomes more important to overlap multiple cache misses than to overlap slow 
memory operations with other computations. In this paper, we propose a novel technique to 
parallelize sequential cache misses, thereby increasing m ... 

Keywords: memory disambiguation, memory level parallelism, prefetching, recovery-free 
value prediction

8 Zero-cycle loads: microarchitecture support for reducing load latency
Todd M. Austin, Gurindar S. Sohi 

December 1995 Proceedings of the 28th annual international symposium on 
Microarchitecture

9 Streamlining data cache access with fast address calculation
Todd M. Austin, Dionisios N. Pnevmatikatos, Gurindar S. Sohi 

May 1995 ACM SIGARCH Computer Architecture News, Proceedings of the 22nd
annual international symposium on Computer architecture, Volume 23 issue 2

Full text available: pdf(1.58 MB) Additional Information: full citation, abstract, references, citings, index terms

For many programs, especially integer codes, untolerated load instruction latencies account 
for a significant portion of total execution time. In this paper, we present the design and 
evaluation of a fast address generation mechanism capable of eliminating the delays caused 
by effective address calculation for many loads and stores. Our approach works by predicting 
early in the pipeline (part of) the effective address of a memory access and using this 
predicted address to speculatively access the ... 

10 Architectural power estimation and optimization: Power-aware issue queue design for
speculative instructions 

Tali Moreshet, R. Iris Bahar 

June 2003 Proceedings of the 40th conference on Design automation 

Full text available: pdf(157.45 KB) Additional Information: full citation, abstract, references, index terms

Speculatively issued instructions may be particularly sensitive to increases in pipeline depth. 
Our results indicate that as pipeline depth increases, speculation increases the percentage of 
issue queue instructions that are waiting to be potentially re-issued in case of a mis- 
speculation. To compensate, issue queues are larger and thus more power hungry. We 
propose an alternative design called the Dual Issue Queue, that retains pre- and post-issue 
instructions in separate, smaller queues ... 

Keywords: low power design, microarchitecture, speculation 



11 Compiler Optimization of Memory-Resident Value Communication Between 
Speculative Threads 

Antonia Zhai, Christopher B. Colohan, J. Gregory Steffan, Todd C. Mowry 
March 2004 Proceedings of the international symposium on Code generation and 
optimization: feedback-directed and runtime optimization 

Full text available: pdf(257.99 KB) Additional Information: full citation, abstract

Efficient inter-thread value communication is essential for improving performance in Thread- 
Level Speculation (TLS). Although several mechanisms for improving value communication 
using hardware support have been proposed, there is relatively little work on exploiting the
potential of compiler optimization. Building on recent research on compiler optimization of 
scalar value communication between speculative threads, we propose compiler techniques 
for the optimization of memory-resident values. In T ... 

12 Compiler controlled value prediction using branch predictor based confidence 
Eric Larson, Todd Austin 

December 2000 Proceedings of the 33rd annual ACM/IEEE international symposium on 
Microarchitecture 

Full text available: pdf(236.58 KB) ps(850.71 KB) Additional Information: full citation, references, index terms

13 Data prefetching by dependence graph precomputation
Murali Annavaram, Jignesh M. Patel, Edward S. Davidson 

May 2001 ACM SIGARCH Computer Architecture News, Proceedings of the 28th
annual international symposium on Computer architecture, Volume 29 issue 2

Full text available: pdf(909.40 KB) Additional Information: full citation, abstract, references, citings, index terms

Data cache misses reduce the performance of wide-issue processors by stalling the data 
supply to the processor. Prefetching data by predicting the miss address is one way to
tolerate the cache miss latencies. But current applications with irregular access patterns 
make it difficult to accurately predict the address sufficiently early to mask large cache miss 
latencies. This paper explores an alternative to predicting prefetch addresses, namely 
precomputing them. The Dependence Graph Pr ... 

14 Enhancing software reliability with speculative threads 
Jeffrey Oplinger, Monica S. Lam 

October 2002 Proceedings of the 10th international conference on Architectural
support for programming languages and operating systems, Volume 37, 36,
30 Issue 10, 5, 5

Full text available: pdf(1.47 MB) Additional Information: full citation, abstract, references, citings

This paper advocates the use of a monitor-and-recover programming paradigm to enhance 
the reliability of software, and proposes an architectural design that allows software and 
hardware to cooperate in making this paradigm more efficient and easier to program. We 
propose that programmers write monitoring functions assuming simple sequential execution 
semantics. Our architecture speeds up the computation by executing the monitoring 
functions speculatively in parallel with the main computation. For ... 

15 Checkpoint Processing and Recovery: Towards Scalable Large Instruction Window 
Processors 

Haitham Akkary, Ravi Rajwar, Srikanth T. Srinivasan 

December 2003 Proceedings of the 36th Annual IEEE/ACM International Symposium on 
Microarchitecture

Full text available: pdf(419.08 KB) Additional Information: full citation, abstract, citings, index terms

Publisher Site

Large instruction window processors achieve high performance by exposing large amounts 
of instruction level parallelism. However, accessing large hardware structures typically
required to buffer and process such instruction window sizes significantly degrade the cycle
time. This paper proposes a novel Checkpoint Processing and Recovery (CPR)
microarchitecture, and shows how to implement a large instruction window processor without
requiring large structures thus permitting a high clock frequency. We fo ...

16 Value-based clock gating and operation packing: dynamic strategies for improving
processor power and performance 

David Brooks, Margaret Martonosi 

May 2000 ACM Transactions on Computer Systems (TOCS), Volume 18 issue 2 

Full text available: pdf(210.51 KB) Additional Information: full citation, abstract, references, citings, index terms

The large address space needs of many current applications have pushed processor designs 
toward 64-bit word widths. Although full 64-bit addresses and operations are indeed 
sometimes needed, arithmetic operations on much smaller quantities are still more 
common. In fact, another instruction set trend has been the introduction of instructions 
geared toward subword operations on 16-bit quantities. For example, most major
processors now include instruction set support for multimedia operation ...

17 Register integration: a simple and efficient implementation of squash reuse 
Amir Roth, Gurindar S. Sohi

December 2000 Proceedings of the 33rd annual ACM/IEEE international symposium on
Microarchitecture

Full text available: pdf(154.98 KB)